Unlike other industries in which intellectual property is patentable, the financial
industry relies on trade secrecy to protect its business processes and methods,
which can obscure critical financial risk exposures from regulators and
the public. We develop methods for sharing and aggregating such risk exposures
that protect the privacy of all parties involved without the need for
a trusted third party. Our approach employs secure multi-party computation
techniques from cryptography in which multiple parties are able to compute
joint functions without revealing their individual inputs. In our framework,
individual financial institutions evaluate a protocol on their proprietary data
which cannot be inverted, leading to secure computations of real-valued statistics
such as concentration indexes, pairwise correlations, and other single- and
multi-point statistics. The proposed protocols are computationally tractable
on realistic sample sizes. Potential financial applications include: the construction
of privacy-preserving real-time indexes of bank capital and leverage
ratios; the monitoring of delegated portfolio investments; financial audits; and
the publication of new indexes of proprietary trading strategies.
Introduction



While there is still considerable controversy over the root causes of the Financial Crisis of 2007- 
2009, there is little dispute that regulators, policymakers, and the financial industry did not have 
ready access to information with which early warning signals could have been generated. For 
example, prior to the Dodd-Frank Act of 2010, even systemically important financial institutions
such as AIG and Lehman Brothers were not obligated to report their amount of financial
leverage, asset illiquidity, counterparty risk exposures, market share, and other critical risk data 
to any regulatory agency. If aggregated over the entire financial industry, such data could have 
played a critical role in providing regulators and investors with advance notice of AIG's un- 
usually concentrated position in credit default swaps, as well as the exposure of money market 
funds to Lehman bonds. Of course, such information is currently considered proprietary and 
highly confidential, and releasing it into the public domain would clearly disadvantage certain 
companies and benefit their competitors. But without this information, regulators and investors 
cannot react in a timely and measured fashion to growing threats to financial stability, thereby 
assuring their realization. 

At the heart of this vexing challenge is privacy. Unlike other industries in which intel- 
lectual property is protected by patents, the financial industry consists primarily of "business 
processes" that the U.S. Patent Office deems unpatentable, at least until recently [1]. Therefore,
trade secrecy has become the preferred method by which financial institutions protect the
vast majority of their intellectual property, hence their need to limit disclosure of their business 
processes, methods, and data. Forcing a financial institution to publicly disclose its proprietary 
information — and without the quid pro quo of 17-year exclusivity that a patent affords — will 
obviously discourage innovation, which benefits no one. Accordingly, government policy has 
trodden carefully on the financial industry's disclosure requirements.

In this paper, we propose a new approach to financial systemic risk management and monitoring
via cryptographic computational methods in which the two seemingly irreconcilable
objectives of protecting trade secrets and providing the public with systemic risk transparency
can be achieved simultaneously. To accomplish these goals, we develop self-regulated protocols
for securely computing aggregate risk measures. The protocols are constructed using
secure multi-party computation tools [2, 3, 4, 5, 6, 7], specifically using secret sharing [8]. It is
known from [6, 2] that general Boolean functions can be securely computed using "circuit evaluation
protocols". Since any function on real-valued data can be approximated arbitrarily
well by a function on quantized (or binary) data, such an approach can theoretically
be used. However, for arbitrary functions and high precision, the resulting protocols may be
computationally too demanding and therefore impractical. We show in this paper that for computing
aggregate risk measures based on standard sample moments such as means, variances,
and covariances (the typical inputs for financial risk measures), simple and efficient protocols
can be achieved using secret-sharing over large finite fields or directly over the reals.

With the resulting measures, it is possible to compute the aggregate risk exposures of a group 
of financial institutions — for example, a concentration (or "Herfindahl") index of the credit 
default swaps market, the aggregate leverage of the hedge-fund industry, or the margin-to-equity 
ratio of all futures brokers — without jeopardizing the privacy of any individual institution. More 
importantly, these measures will enable regulators and the public to accurately measure and 
monitor the amount of risk in the financial system while preserving the intellectual property 
and privacy of individual financial institutions. 

Privacy-preserving risk measures may also facilitate the ability of the financial industry 
to regulate itself more effectively. Despite the long history of "self-regulatory organizations" 
(SROs) in financial services, the efficacy of self-regulation has been sorely tested by the recent
financial crisis. However, SROs might be considerably more effective if they had access to
timely and accurate information about systemic risk that did not place any single stakeholder at
a competitive disadvantage. Also, the broad dissemination of privacy-preserving systemic risk 
measures will enable the public to respond appropriately as well, reducing general risk-taking 
activity as the threat of losses looms larger due to increasing systemic exposures. Truly sustain- 
able financial stability is more likely to be achieved by such self-correcting feedback loops than 
by any set of regulatory measures. 

Secure Protocols 

Many important statistical measures, such as the mean, standard deviation, concentration ratios,
and pairwise correlations, can be obtained by taking summations and inner products on the data.
Therefore, we present secure protocols for these two specific functions.

We start with a basic protocol to securely compute the sum of $m$ secret numbers. This protocol
results from an application of secret-sharing [8] and basic probability results. We assume
that each number belongs to a known range, which we pick to be $[0, 1]$ for simplicity. Recall
that the operation $a$ modulo $m$ (written $a \bmod m$) produces the unique number $a + km \in [0, m)$
where $k$ is an integer, e.g., $3.6 \bmod 2 = 1.6$.

Secure-Sum Protocol 

For $i = 1, \ldots, m$, each party $i$ possesses the secret number $x_i \in [0, 1]$ as an input, and the
output to each party is $s = \sum_{i=1}^{m} x_i$ (where the addition is over the reals).
The protocol is as follows:

1. Each pair of parties privately exchanges random numbers. Namely, for all $i, j$ with $i \neq j$,
party $i$ provides to party $j$ a random number $R_{ij}$ drawn uniformly at random in $[0, m]$.

2. For each $i$, party $i$ adds to its secret number the random numbers it has received from
other parties and subtracts the random numbers it has provided to other parties. More
formally, party $i$ computes $S_i = x_i + \sum_{j \neq i} R_{ji} - \sum_{j \neq i} R_{ij} \bmod m$. Each
party publicly reveals $S_i$.

3. Each party computes $S = \sum_{i=1}^{m} S_i \bmod m$, which equals $s = \sum_{i=1}^{m} x_i$.
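
The three steps above can be sketched in a few lines of Python. This is only a single-process simulation under stated assumptions: the function name `secure_sum` and the matrix layout `R[i][j]` (the number party i sends to party j) are illustrative choices rather than part of the protocol description, and floating-point arithmetic stands in for exact real arithmetic, so the result is exact only up to rounding.

```python
import random

def secure_sum(secrets, rng=None):
    """Simulate the Secure-Sum protocol for secrets in [0, 1].

    Returns (published, total): the values S_i revealed in step 2 and
    the recovered sum from step 3. All names here are illustrative.
    """
    rng = rng or random.Random()
    m = len(secrets)
    # Step 1: party i privately sends R[i][j] ~ Uniform[0, m] to party j (i != j).
    R = [[rng.uniform(0, m) if i != j else 0.0 for j in range(m)]
         for i in range(m)]
    # Step 2: each party adds what it received, subtracts what it sent, mod m.
    published = []
    for i in range(m):
        received = sum(R[j][i] for j in range(m))
        sent = sum(R[i][j] for j in range(m))
        published.append((secrets[i] + received - sent) % m)
    # Step 3: the published values add up (mod m) to the sum of the secrets.
    total = sum(published) % m
    return published, total

if __name__ == "__main__":
    S, s = secure_sum([0.1, 0.2, 0.3], random.Random(0))
    print(S, s)
```

In an actual deployment each party would run only its own part of steps 1 and 2 over private channels; the simulation merely checks that the random masks cancel.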



Numerical example. Let $m = 3$ (i.e., three parties), $x_1 = 0.1$, $x_2 = 0.2$ and $x_3 = 0.3$. In the first
round of the protocol, the parties exchange random numbers $R_{ij}$. For example,

                        Party 1    Party 2    Party 3
    Party 1 provides       -         1.4        2.1
    Party 2 provides      1.1         -         2.3
    Party 3 provides      0.3        2.9         -

In the second round, party $i$ adds to its secret number the elements of the $i$-th column and
subtracts the elements of the $i$-th row (using modulo 3 arithmetic). Each party publishes the
result $S_i$.



    $S_1$    $S_2$    $S_3$
     1       1.1      1.5



Finally, the parties add these numbers (modulo 3) and compute the output sum: 

s = 3.6 mod 3 = 0.6. 
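
The worked example can be checked mechanically. The sketch below re-enters the exchanged numbers from the example and uses Python's `fractions.Fraction` so that the modulo-3 arithmetic is exact; the variable names are illustrative only.

```python
from fractions import Fraction as F

m = 3
x = [F("0.1"), F("0.2"), F("0.3")]
# R[i][j] is the number that party i+1 provides to party j+1 (diagonal unused).
R = [
    [F(0),     F("1.4"), F("2.1")],
    [F("1.1"), F(0),     F("2.3")],
    [F("0.3"), F("2.9"), F(0)],
]
# Column i holds the numbers party i received; row i holds the numbers it provided.
S = [
    (x[i] + sum(R[j][i] for j in range(m)) - sum(R[i])) % m
    for i in range(m)
]
total = sum(S) % m
print([float(s) for s in S], float(total))  # [1.0, 1.1, 1.5] 0.6
```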

Protocol correctness and secrecy. If the parties follow the protocol correctly, it is easy to check
that the correct sum is always obtained, since each element $R_{ij}$ is added and subtracted once in
$S$. In addition, we show that this protocol reveals nothing else about the secret numbers than
their sum, even if the parties attempt to infer more from the exchanged data. For example, Party
1 may try to learn more about other parties' secret numbers by using the information gathered
in $S_1, S_2, S_3$. We state informally the secrecy guarantee in the following theorem and provide
a formal statement and proof in the appendix. We first illustrate a weaker fact here by plotting
the values of $S_1, S_2, S_3$ for several realizations of the random numbers $R_{ij}$, while keeping fixed
$x_1 = 0.1$, $x_2 = 0.2$ and $x_3 = 0.3$. As shown in Figure 1, the realizations of $(S_1, S_2, S_3)$ uniformly
cover the set of points $(s_1, s_2, s_3)$ for which $s_1 + s_2 + s_3 \bmod 3 = 0.6$, suggesting that there is no
relevant information in the $S_i$'s other than their sum.

Figure 1: Each point in the plot is a realization of $(S_1, S_2, S_3)$ (step 2 in the Secure-Sum protocol)
for a drawing of the matrix $R$, keeping $x_1 = 0.1$, $x_2 = 0.2$ and $x_3 = 0.3$ fixed. As illustrated
by the plot, the set of points $(s_1, s_2, s_3)$ for which $s_1 + s_2 + s_3 \bmod 3 = 0.6$ is uniformly covered,
suggesting that the $S_i$'s do not carry any other information about the $x_i$'s than their sum.

The following is obtained assuming that parties follow the protocol requirements without 
deviating from it. 

Theorem 1. The Secure-Sum protocol outputs the sum of m privately owned real numbers and 
does not reveal any additional information about the individual numbers. 

This theorem follows directly from secret-sharing [8] and basic probability results. For
convenience, we provide a proof in the Appendix. 

Secure-Inner-Product Protocol 

To compute securely the inner product of two real vectors, slightly more sophisticated protocols
are developed and presented in the appendix, using basic secret sharing [8], secret-sharing as
employed in [VlOSJI, and Oblivious Transfer [9, 10]. The variants include information-theoretic
and cryptographic protocols on quantized or real data, and have different attributes discussed in
the appendix. We state here an informal result regarding one of these protocols, which we call
Secure-Inner-Product protocol 1.

Theorem 2. The Secure-Inner-Product protocol 1 outputs the inner product of two privately owned
quantized vectors and does not reveal any additional information about the individual vectors.

Note that the previous two theorems hold provided that the parties follow the protocol requirements
(without colluding or cheating). Extensions to malicious parties or other types of functions
can also be developed but are not discussed here.

Illustrative Example 

To illustrate the practical implementation of privacy-preserving measures, we provide a simple
numerical example using publicly available quarterly data from June 1986 to December 2010
(released in arrears by the U.S. Federal Reserve) on the total amount of outstanding loans linked
to real estate issued by three major bank holding companies: Bank of America, JPMorgan, and
Wells Fargo [11]. Suppose that the aggregate value of these loans across the three banks is the
risk exposure of interest, and the magnitude of outstanding loans for each bank is the proprietary
data to be kept private. The historical time series of these data are displayed in Figure 2(a): the
bar graph in blue is the aggregate risk exposure to be computed and the three line graphs are the
proprietary inputs.

The desired result can be obtained with an application of the Secure-Sum protocol described
above [12], which consists of two steps. In the first step, each institution produces two random
numbers to be shared, one for each of the other two participating institutions. These numbers
are shown in the line graphs of Figure 2(b), where the color coding indicates the institution
generating the random numbers. Since these numbers are purely random, there is no relationship
between them and the private data of Figure 2(a), a fact that is clear from visual inspection of
the intermediate outputs in Figure 2(b).



In the second step of the Secure-Sum protocol, each institution uses its private data, the
two numbers it receives from the other two participating banks, as well as the two numbers it
sends to the other two institutions to produce a single value, which we refer to as the
privacy-preserving measure of its private data. This value will be revealed to the other two institutions.
While these privacy-preserving measures, shown in Figure 2(c), seem like pure noise, they
retain just enough of the original data so that the sum of these three numbers under modulo
arithmetic yields the correct sum of the original inputs. The key here is that the randomness
produced in the first step, as shown in Figure 2(b), exactly cancels in the second step due to
the way that the protocol is constructed. It is apparent that the aggregate loans outstanding in
Figure 2(c) are identical to the corresponding graph in Figure 2(a), but the former graph has been
computed using only the privacy-preserving measures of Figure 2(c).



Despite the fact that the underlying data used in this example is not confidential, even in this 
simple illustrative case privacy-preserving measures may still prove useful in providing financial 
institutions and regulators with an incentive to release the data without a lag. More timely 
releases would obviously benefit all stakeholders by allowing them to respond more nimbly to 
changing market conditions, but such releases could also disadvantage certain parties in favor 
of others if privacy were not assured. Moreover, this example underscores the simplicity with 
which more sensitive data such as leverage ratios, positions in illiquid assets, and off-balance- 
sheet derivatives holdings can be shared regularly, securely, and in a timely fashion. 

We consider only three institutions in this example because it is the simplest non-trivial 
case in which privacy-preserving measures of aggregate sums can be constructed. Clearly, the 
protocol is applicable for any number of participants greater than two, and implementation for 
even several thousand participants is extremely fast. More complex risk exposures such as
Herfindahl concentration indices require two applications of the Secure-Sum protocol, but the 
computational burdens are still quite modest. The Secure-Inner-Product protocol can be used 
to construct multi-point statistical measures such as average correlations between changes in 
securities holdings or leverage across industry participants. 
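
As a rough illustration of the two-pass idea for a Herfindahl index, the sketch below first runs a secure sum to obtain the market total and then runs a second secure sum over each party's squared market share. The decomposition into these two particular sums, the helper names `secure_sum` and `secure_herfindahl`, and the assumption that exposures are pre-scaled to [0, 1] are choices made for this example; the text does not specify how the two applications are arranged.

```python
import random

def secure_sum(secrets, rng):
    """Single-process simulation of the Secure-Sum protocol (values in [0, 1])."""
    m = len(secrets)
    R = [[rng.uniform(0, m) if i != j else 0.0 for j in range(m)]
         for i in range(m)]
    published = [
        (secrets[i] + sum(R[j][i] for j in range(m)) - sum(R[i])) % m
        for i in range(m)
    ]
    return sum(published) % m

def secure_herfindahl(exposures, rng=None):
    """Herfindahl index sum_i (x_i / total)^2 via two secure sums (a sketch)."""
    rng = rng or random.Random()
    # Pass 1: everyone learns only the aggregate exposure.
    total = secure_sum(exposures, rng)
    # Each party squares its own market share locally; the share stays private.
    squared_shares = [(x / total) ** 2 for x in exposures]
    # Pass 2: everyone learns only the sum of squared shares.
    return secure_sum(squared_shares, rng)

if __name__ == "__main__":
    print(secure_herfindahl([0.5, 0.3, 0.2], random.Random(1)))
```

With exposures 0.5, 0.3 and 0.2 the index is 0.25 + 0.09 + 0.04 = 0.38, up to floating-point rounding.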

Discussion 

By construction, privacy-preserving measures of financial risk exposures cannot be "reverse- 
engineered" to yield information about the individual constituents. Accordingly, there is no 
guarantee that the individual inputs are truthful. In this respect, the potential for misreporting
and fraud is no different for these measures than it is for current reporting obligations
by financial institutions to their regulators, and existing mechanisms for ensuring compliance — 
random periodic examinations and severe criminal and civil penalties for misleading disclosures — 
must be applied here as well. 

However, unlike traditional regulatory disclosures, privacy-preserving measures will provide
their users with a strong incentive to be truthful because the mathematical guarantee of privacy
eliminates the primary motivation for obfuscation. Since each institution's proprietary
information remains private even after disclosure, dishonesty yields no discernible benefits but 
could have tremendous reputational costs, and this asymmetric payoff provides significantly 
greater economic incentive for compliance. Moreover, accurate and timely measures of system- 
wide risk exposures can benefit the entire industry in allowing institutions and investors to en- 
gage in self-correcting behavior that can reduce the likelihood of systemic shocks. For example, 
if all stakeholders were able to monitor the aggregate amount of leverage in the financial system 
at all times, there is a greater chance that market participants would become more wary and less 
aggressive as they observe leverage rising beyond prudent levels. 

Figure 2: An illustration of a privacy-preserving measure of the aggregate amount of real-estate-linked
loans outstanding for Bank of America, JPMorgan, and Wells Fargo from June 1986 to
December 2010. Panel (a) contains the three historical quarterly time series of outstanding
loans, which are private, and the aggregate sum, which we wish to compute securely.
Panel (b) contains the six time series of intermediate numbers that are exchanged bilaterally
between all pairs. Panel (c) contains the three privacy-preserving values that are shared between
all banks and used to compute the aggregate sum, which is identical to the aggregate sum in
Panel (a).

A related issue is whether participation in privacy-preserving disclosures of financial risk
exposures is voluntary or mandated by regulation. Given the extremely low cost/benefit ratio 
of such disclosures, there is reason to believe that the financial industry may well adopt such 
disclosures voluntarily. A case in point is Markit, a successful industry consortium of dealers of 
credit default swaps (CDS) that emerged in 2001 to pool confidential pricing data on individual 
CDS transactions and make the anonymized data available to each other and the public so as 
to promote transparency and liquidity in this market [13]. According to Markit's website, the
data of its consortium members are ". . .provided on equal terms to whoever wanted to use it, 
with the same data released to all customers at the same time, giving both the sell-side and 
buy-side access to exactly the same daily valuation and risk management information". From 
this carefully crafted statement, it is clear that equitable and easy access to data is of paramount 
importance in structuring this popular data-sharing consortium. Privacy-preserving methods of 
sharing information could greatly enhance the efficacy and popularity of such cooperatives. 

The same motivation applies to the sharing of aggregate financial risk exposures, but with 
even greater stakes as the recent financial crisis has demonstrated. Once a privacy-preserving 
system-risk-exposures consortium is established, the benefits will so clearly dominate the nom- 
inal costs of participation that it should gain widespread acceptance and adoption in short order. 
Indeed, participation in such a consortium may serve as a visible commitment to industry best 
practices that yields tangible benefits for business development, leading to a "virtuous cycle" of 
privacy-preserving risk disclosure throughout the financial industry.

Conclusion 

Privacy-preserving measures of financial risk exposures solve the challenge of measuring ag- 
gregate risk among multiple financial institutions without encroaching on the privacy of any 
individual institution. Previous approaches to addressing this challenge require trusted third 
parties, i.e., regulators, to collect, archive, and properly assess systemic risk. Apart from the
burden this places on government oversight, such an approach is also highly inefficient, requir- 
ing properly targeted and perfectly timed regulatory intervention among an increasingly com- 
plex and dynamic financial system. Privacy-preserving measures can promote more efficient 
"crowdsourced" responses to emerging threats to systemic stability, enabling both regulators 
and market participants to accurately monitor systemic risks in a timely and coordinated fash- 
ion, creating a more responsive negative-feedback loop for stabilizing the financial system. This 
feature may be especially valuable for promoting international coordination among multiple 
regulatory jurisdictions. While a certain degree of regulatory competition is unavoidable given 
the competitive nature of sovereign governments, privacy-preserving measures do eliminate a 
significant political obstacle to regulatory collaboration across national boundaries. 

Privacy-preserving risk measures have several other financial and non-financial applications. 
Investors such as endowments, foundations, pension and sovereign wealth funds can use these 
measures to ensure that their investments in various proprietary vehicles — hedge funds, private 
equity, and other private partnerships — are sufficiently diversified and not overly concentrated in 
a small number of risk factors. Financial auditors charged with the task of valuing illiquid assets 
at a given financial institution can use these measures to compare and contrast their valuations 
with the industry average and the dispersion of valuations across multiple institutions. Real- 
time indexes of the aggregate amount of hedging activity in systemically important markets like 
the S&P 500 futures contract may be constructed, which could have served as an early warning 
signal for the "Flash Crash" of May 6, 2010. 

More broadly, privacy-preserving measures of risk exposures may be useful in other in- 
dustries in which aggregate risks are created by individual institutions and where maintaining 
privacy in computing such risks is important for promoting transparency and innovation, such 
as healthcare, epidemiology, and agribusiness. 
Appendix

In this appendix, we provide formal theorems and proofs of the security guarantees ensured 
by the Secure-Sum and three Secure-Inner-Product protocols, assuming semi-honest parties 
(possibly curious but following the protocol correctly). Extensions to malicious parties can be 
considered but are not discussed here. 

Secure-Inner-Product protocols 1 and 2 use a third dummy party to help with the computations,
while Secure-Inner-Product protocol 3 does not. The dummy party neither possesses
inputs nor receives meaningful information but simply helps with the computation (note that
for the applications in mind, the use of a dummy party does not represent a significant obsta- 
cle). Secure-Inner-Product protocols 1 and 3 are defined on quantized data, while Secure-Inner- 
Product protocol 2 applies directly to real-valued data. Finally, Secure-Inner-Product protocol 
1 provides information-theoretic security, Secure-Inner-Product protocol 2 provides 'almost' 
information-theoretic security (as defined in Theorem 5) and both protocols require only elementary
operations at a computational level, while Secure-Inner-Product protocol 3 provides
cryptographic security (i.e., it relies on computational-hardness assumptions) and uses OT protocols
(hence non-elementary operations such as RSA [14] encryptions and decryptions).

An important benchmark for the practical consideration of secure protocols is the number of 
communication rounds, which require exchange of data over communications media such as the 
internet. With a standard internet connection and for arbitrary distances this can take no longer 
than 2-3 seconds but may also dominate the protocol running time. All protocols proposed here 
require few communication rounds. The following table summarizes these properties, where n 
denotes the vector dimension and q the quantization level. 
[Table: properties of Secure-Inner-Product protocols 1, 2 and 3]

Sum Protocols and Theorems

For convenience, we restate the Secure-Sum protocol. 
Secure-Sum Protocol.

Inputs: for $i = 1, \ldots, m$, party $i$ possesses the secret number $x_i \in [0, 1]$.
Output: each party obtains $s = \sum_{i=1}^{m} x_i$ (where the addition is over the reals).
Protocol:

1. Each pair of parties privately exchanges random numbers. Namely, for all $i, j$ with $i \neq j$,
party $i$ provides to party $j$ a random number $R_{ij}$ drawn uniformly at random in $[0, m]$.

2. For each $i$, party $i$ adds to its secret number the random numbers it has received from
other parties and subtracts the random numbers it has provided to other parties. In formula,
party $i$ computes $S_i = x_i + \sum_{j \neq i} R_{ji} - \sum_{j \neq i} R_{ij} \bmod m$. Each party publicly
reveals $S_i$.

3. Each party computes $S = \sum_{i=1}^{m} S_i \bmod m$, which equals $s = \sum_{i=1}^{m} x_i$.

One can define other variants and extensions of this protocol, in which fewer random num- 
bers are exchanged to minimize information flow, or in which more information is exchanged 
to check the correctness of parties' computations (one may also use virtual parties for that).

Theorem 3. Let $x_1, \ldots, x_m$ be $m$ privately owned real numbers. Let $i \in \{1, \ldots, m\}$ and $\mathrm{View}_i$
denote the view of party $i$ obtained from the Secure-Sum protocol with inputs $x_1, \ldots, x_m$. The
protocol outputs the sum $s = \sum_{i=1}^{m} x_i$ and the distribution of $\mathrm{View}_i$ depends on $x_1, \ldots, x_m$ only
through $s$ and $x_i$.

We first provide the proof argument for $m = 3$. Assume that party 1 collects all the data it
possesses and received from other parties to try to learn something about their secret numbers.
That is, party 1 possesses its secret number $x_1$, the numbers $R_{12}, R_{13}, R_{21}, R_{31}$ exchanged in
step 1, the numbers $S_1, S_2, S_3$ revealed in step 2 and the output sum $s$ (whose information is
already contained in the $S_i$'s). From these, party 1 can subtract in $S_2, S_3$ the terms depending
on $R_{12}, R_{13}, R_{21}, R_{31}$ and obtain the right-hand side of

$$x_2 + (R_{32} - R_{23}) = S_2 + (R_{21} - R_{12}) \bmod 3 \qquad (1)$$
$$x_3 - (R_{32} - R_{23}) = S_3 + (R_{31} - R_{13}) \bmod 3 \qquad (2)$$

and this is all the information party 1 can gather about other parties' secret numbers. Adding
these equations provides $x_2 + x_3 = s - x_1$, i.e., what can be deduced from knowing the sum of
the secret numbers. To see that nothing else can be inferred from (1) or (2), note that $R_{32} - R_{23}$ is
uniform on $[0, m]$. However, for any fixed number $x \in [0, 1]$, if one adds to it a random number
$R$ uniformly drawn in $[0, m]$, the number $x + R$ is also uniformly drawn in $[0, m]$. Therefore,
(1) (or (2)) does not provide any further information about $x_2$ (or $x_3$).

Proof of Theorem 3. All the arithmetic in this proof is modulo $m$. We set $R_{ii} = 0$ for all $i$, to
simplify notation. We first check that the protocol indeed computes the sum. This is straightforward
since $S_i = x_i + \sum_j (R_{ji} - R_{ij})$ and hence $\sum_{i=1}^{m} S_i = \sum_{i=1}^{m} x_i$. Let $\mathrm{View}_1$ be the protocol
view of party 1, i.e.,

$$\mathrm{View}_1 = \{x_1, R_{1i}, R_{i1}, S_i, \ \forall\, 1 \le i \le m\}.$$

Party 1 can subtract the $R_{ij}$'s it has access to in the $S_i$'s, obtaining $\mathrm{View}'_1$ as a sufficient statistic
for $\mathrm{View}_1$, where

$$\mathrm{View}'_1 = \{x_1, I_i, \ \forall\, i \neq 1\}$$

and

$$I_i = x_i + Z_i, \qquad Z_i = \sum_{j \neq 1} (R_{ji} - R_{ij}).$$

Let us define $Z = [Z_2, \ldots, Z_m]^t$ and $W = [R_2, \ldots, R_m]^t$, where $R_j$ contains all the $R_{ji}$ for
which $j \neq i$ (in increasing order). Note that $Z$ and $W$ are random vectors of dimension
respectively $(m - 1) \times 1$ and $m(m - 1) \times 1$. We then have that

$$Z = AW - A\Pi W,$$

where $A$ is the $(m - 1) \times m(m - 1)$ matrix whose $i$-th row is filled with 0's except at columns
$[i(m - 2) + 1, (i + 1)(m - 2)]$ where it is 1, and $\Pi$ is a permutation matrix. Note that the rank
of $A$ and the rank of $M := A(I - \Pi)$ is $m - 2$, implying that $\mathrm{Im}(M) = E_2^m$, where

$$E_2^m := \Big\{u_2, \ldots, u_m \in [0, m] : \sum_{i=2}^{m} u_i = 0\Big\}.$$

Therefore, for any $z, d \in E_2^m$, there exists $w$ such that $Mw = d$ and

$$\mathbb{P}\{MW \le z + d\} = \mathbb{P}\{M(W - w) \le z\} = \mathbb{P}\{MW \le z\},$$

where the second equality uses the fact that $W$ and $W - w$ are both i.i.d. uniform over $[0, m]$.
This shows that $Z = MW$ is uniform over $E_2^m$ and $I = [I_2, \ldots, I_m]$ is uniform over

$$\Sigma_2^m(x_2, \ldots, x_m) := \Big\{u_2, \ldots, u_m \in [0, m] : \sum_{i=2}^{m} u_i = \sum_{i=2}^{m} x_i\Big\}.$$

Therefore, the distribution of $\mathrm{View}'_1$, and hence of $\mathrm{View}_1$, depends only on $\sum_{i=2}^{m} x_i = s - x_1$
and $x_1$. By symmetry, the analogous conclusion holds for any party, which concludes the proof
of the theorem. □

Inner-Product Protocols and Theorems

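We now present secure protocols to compute the sample correlation, or equivalently the inner
product, between two real vectors. Recall that the sample correlation of two vectors $x = \{x_i\}_{i=1}^{t}$
and $y = \{y_i\}_{i=1}^{t}$ is given by

$$\frac{1}{t-1} \sum_{i=1}^{t} \frac{(x_i - \bar{x})(y_i - \bar{y})}{s_x s_y} = \sum_{i=1}^{t} \tilde{x}_i \tilde{y}_i,$$

where $\bar{x} = \frac{1}{t} \sum_{i=1}^{t} x_i$, $s_x = \big(\frac{1}{t-1} \sum_{i=1}^{t} (x_i - \bar{x})^2\big)^{1/2}$, $\bar{y} = \frac{1}{t} \sum_{i=1}^{t} y_i$, $s_y = \big(\frac{1}{t-1} \sum_{i=1}^{t} (y_i - \bar{y})^2\big)^{1/2}$,
$\tilde{x}_i = \big(\frac{1}{t-1}\big)^{1/2} (x_i - \bar{x})/s_x$ and $\tilde{y}_i = \big(\frac{1}{t-1}\big)^{1/2} (y_i - \bar{y})/s_y$.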
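
The reduction from sample correlation to an inner product can be illustrated with a short sketch: each party normalizes its own vector locally, and only the inner product of the normalized vectors needs to be computed jointly. The function names `normalize` and `correlation` are assumptions for this example, and the inner product is computed in the clear here simply to check the identity.

```python
import math

def normalize(v):
    """Center v and scale by s * sqrt(t - 1), where s is the sample std. deviation."""
    t = len(v)
    mean = sum(v) / t
    s = math.sqrt(sum((a - mean) ** 2 for a in v) / (t - 1))
    return [(a - mean) / (s * math.sqrt(t - 1)) for a in v]

def correlation(x, y):
    """Sample correlation as the inner product of locally normalized vectors."""
    return sum(a * b for a, b in zip(normalize(x), normalize(y)))

if __name__ == "__main__":
    print(correlation([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

The normalization uses only a party's own data, so it can be done before any secure protocol is run.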



Definition 1. We denote by $\mathbb{Z}_q$ the set $\{0, 1, \ldots, q - 1\}$, and by $\mathbb{F}_q$ the same set equipped with
the Galois field operations when $q$ is a power of a prime. We define by $\Sigma_k(x, \mathbb{F}_q)$ the set of
$k$-tuples in $\mathbb{F}_q$ which add up to $x$, i.e.,

$$\Sigma_k(x, \mathbb{F}_q) := \{(y_1, \ldots, y_k) \in \mathbb{F}_q^k : y_1 + \cdots + y_k \bmod q = x\}.$$

We call the $y_i$'s the shares of $x$.

Secure-Inner-Product Protocol 1.

Common inputs: $q \in \mathbb{Z}_+$ (the quantization level), $n \in \mathbb{Z}_+$ (the vector dimensions) and $p$ a prime
larger than $q^2 n$.
Party 1 inputs: $x_1, \ldots, x_n \in \mathbb{Z}_q$.
Party 2 inputs: $y_1, \ldots, y_n \in \mathbb{Z}_q$.
Party 3 inputs: none.

1. For $i = 1, \ldots, n$, party 1 splits $x_i$ in three shares $x_i(1)$, $x_i(2)$ and $x_i(3)$ uniformly drawn
in $\Sigma_3(x_i, \mathbb{F}_p) := \{(a, b, c) \in \mathbb{F}_p^3 : a + b + c \bmod p = x_i\}$ and party 2 splits $y_i$ in three
shares $y_i(1)$, $y_i(2)$ and $y_i(3)$ uniformly drawn in $\Sigma_3(y_i, \mathbb{F}_p)$. Party 1 provides privately to
party 2 the shares $x_i(1)$, $x_i(2)$ and privately to party 3 the share $x_i(3)$. Party 2 provides
privately to party 1 the shares $y_i(1)$, $y_i(2)$ and privately to party 3 the share $y_i(3)$.

2. Party 1 sets $p_i(1) = (x_i(1) + x_i(3))(y_i(1) + y_i(2)) \bmod p$ and $p(1) = \sum_{i=1}^{n} p_i(1) \bmod p$,
party 2 sets $p_i(2) = y_i(3)(x_i(1) + x_i(2)) + x_i(2)(y_i(1) + y_i(2)) \bmod p$ and
$p(2) = \sum_{i=1}^{n} p_i(2) \bmod p$, and party 3 sets $p_i(3) = x_i(3) y_i(3) \bmod p$ and $p(3) = \sum_{i=1}^{n} p_i(3) \bmod p$.
For $m = 1, 2, 3$, party $m$ splits $p(m)$ in three shares $p(m, 1)$, $p(m, 2)$ and $p(m, 3)$ uniformly
drawn in $\Sigma_3(p(m), \mathbb{F}_p)$ and reveals privately $p(m, k)$ to party $k$, for $k = 1, 2, 3$.

3. For $k = 1, 2, 3$, party $k$ computes $R(k) = \sum_{m=1}^{3} p(m, k) \bmod p$. Parties 1 and 2 exchange
$R(1)$ and $R(2)$ and party 3 provides $R(3)$ to parties 1 and 2. Parties 1 and 2
compute $R(1) + R(2) + R(3) = \sum_{i=1}^{n} x_i y_i$.
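
The arithmetic of Secure-Inner-Product Protocol 1 can be checked with a small single-process simulation. The sketch below is illustrative only: the helper names `split` and `secure_inner_product` are assumptions, all three parties are simulated in one function (so no privacy is actually enforced), and Python's `random` module is used in place of a cryptographically secure source of randomness.

```python
import random

def split(value, p, rng):
    """Three additive shares of value, uniform among triples summing to value mod p."""
    a, b = rng.randrange(p), rng.randrange(p)
    return a, b, (value - a - b) % p

def secure_inner_product(x, y, p, rng=None):
    """Simulate Protocol 1 on quantized vectors x, y with prime p > q^2 * n."""
    rng = rng or random.Random()
    # Step 1: party 1 shares each x_i, party 2 shares each y_i.
    xs = [split(v, p, rng) for v in x]  # xs[i] = (x_i(1), x_i(2), x_i(3))
    ys = [split(v, p, rng) for v in y]  # ys[i] = (y_i(1), y_i(2), y_i(3))
    # Step 2: each party computes its partial product from the shares it holds.
    p1 = sum((a[0] + a[2]) * (b[0] + b[1]) for a, b in zip(xs, ys)) % p
    p2 = sum(b[2] * (a[0] + a[1]) + a[1] * (b[0] + b[1])
             for a, b in zip(xs, ys)) % p
    p3 = sum(a[2] * b[2] for a, b in zip(xs, ys)) % p
    # Each party re-shares its partial product: reshared[m][k] plays the role of p(m+1, k+1).
    reshared = [split(v, p, rng) for v in (p1, p2, p3)]
    # Step 3: party k adds the shares it received; the three R(k) are then combined.
    R = [sum(reshared[m][k] for m in range(3)) % p for k in range(3)]
    return sum(R) % p

if __name__ == "__main__":
    # q = 4, n = 3, so any prime larger than 48 works; 53 is used here.
    print(secure_inner_product([1, 2, 3], [3, 0, 2], 53, random.Random(0)))  # 9
```

Because the true inner product is smaller than $p$, reducing modulo $p$ at the end does not change the result.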

Theorem 4. Let $x = [x_1, \ldots, x_n]$ and $y = [y_1, \ldots, y_n]$ be two privately owned vectors on $\mathbb{F}_q^n$.
Let $\mathrm{View}_1$ denote the view of party 1 obtained from the Secure-Inner-Product protocol 1 with
inputs $x, y$. The protocol outputs the inner product $\rho = \sum_{i=1}^{n} x_i y_i$ and the distribution of $\mathrm{View}_1$
depends on $x, y$ only through $\rho$ and $x$. The reciprocal result holds for party 2.

Proof of Theorem 4. The arithmetic is on $\mathbb{F}_p$ in the following. We first check that the protocol
indeed computes the inner product. For every $i = 1, \dots, n$, $p_i(1) + p_i(2) + p_i(3) = x_i y_i$, hence

$$\sum_{i=1}^n \big(p_i(1) + p_i(2) + p_i(3)\big) = p(1) + p(2) + p(3) = \sum_{i=1}^n x_i y_i.$$

Moreover, $\sum_{k=1}^3 p(m, k) = p(m)$, hence

$$\sum_{k=1}^3 R(k) = \sum_{k=1}^3 \sum_{m=1}^3 p(m, k) = \sum_{m=1}^3 p(m) = \sum_{i=1}^n x_i y_i.$$

Let $\mathrm{View}_1$ be the protocol view of party 1, which is a function of

$$\mathrm{View}_1' = \{x, y(1), y(2), p(2, 1), p(3, 1), R(2), R(3)\},$$

where $y(1)$ contains all components $y_i(1)$ for $i = 1, \dots, n$, and similarly for $y(2)$. Note that
for $i = 1, \dots, n$, $(p_i(1), p_i(2), p_i(3))$ are independent and uniformly drawn in $S_3(p_i, \mathbb{F}_p)$, where
$p_i = x_i y_i$. Moreover, steps 2 and 3 of the protocol are equivalent to running the secure-sum
protocol on $p(1), p(2), p(3)$. Hence, from Theorem 3, for any realization of $p(1), p(2), p(3)$,
the distribution of $p(2, 1), p(3, 1), R(2), R(3)$ depends only on the sum $p(1) + p(2) + p(3) =
\sum_{i=1}^n p_i$ and on $p(1)$, where $p(1)$ depends only on $x$ and on $y(1), y(2)$, which are independent
and uniformly distributed over $\mathbb{F}_p$. Therefore, the distribution of $\mathrm{View}_1'$, hence of $\mathrm{View}_1$, depends
only on $\sum_{i=1}^n p_i = p$ and on $x$. $\square$

Secure-Inner-Product Protocol 2. 

Common input: $n \in \mathbb{Z}_+$ (the vector dimension) and $r > n$.
Party 1 inputs: $x_1, \dots, x_n \in [0, 1]$.
Party 2 inputs: $y_1, \dots, y_n \in [0, 1]$.
Party 3 inputs: none. 

1. For $i = 1, \dots, n$,

(a) party 1 splits $x_i$ in three shares by evaluating a random polynomial $t \mapsto X_i(t)$
at $(t_1, t_2, t_3) = (1/4, 1/2, 3/4)$, where $X_i(t) = x_i + a_i t \bmod r$ and where $a_i$ is
uniformly drawn in $[0, r]$. Party 1 reveals $X_i(t_j)$ to party $j$ for $j = 2, 3$,

(b) party 2 splits $y_i$ in three shares $Y_i(t_j) = y_i + b_i t_j \bmod r$, for $j = 1, 2, 3$, where $b_i$ is
uniformly drawn in $[0, r]$, and reveals $Y_i(t_j)$ to party $j$ for $j = 1, 3$.

2. For j = 1,2,3, 

(a) party $j$ computes $P(t_j) = \sum_{i=1}^n X_i(t_j) Y_i(t_j) \bmod r$,

(b) party $j$ draws $\alpha_j, \beta_j$ independently and uniformly at random in $[0, r]$ and, for $k =
1, 2, 3$, sets $Z_j(t_k) = \alpha_j t_k + \beta_j t_k^2 \bmod r$ and shares $Z_j(t_k)$ with party $k$,

(c) $p(t_j) = P(t_j) + \sum_{k=1}^3 Z_k(t_j) \bmod r$ is made available to parties 1 and 2.

3. Parties 1 and 2 compute $p(0)$ by interpolating a degree-2 polynomial on $p(t_j)$, $j = 1, 2, 3$,
obtaining $p(0) = \sum_{i=1}^n x_i y_i$.
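To see why the interpolation in step 3 returns the inner product, it may help to write it out explicitly. The following is a sketch that sets aside the mod $r$ reductions (an assumption made here for readability); the symbols $\ell_1, \ell_2, \ell_3$ for the Lagrange weights at $t = 0$ are introduced only for illustration. Since $X_i(0) = x_i$, $Y_i(0) = y_i$ and $Z_k(0) = 0$, the degree-2 polynomial $p(t) = \sum_{i} X_i(t) Y_i(t) + \sum_{k} Z_k(t)$ has constant term $\sum_i x_i y_i$, and for the nodes $(1/4, 1/2, 3/4)$:

```latex
\begin{aligned}
\ell_1 &= \frac{(0-\tfrac12)(0-\tfrac34)}{(\tfrac14-\tfrac12)(\tfrac14-\tfrac34)} = 3, \qquad
\ell_2 = \frac{(0-\tfrac14)(0-\tfrac34)}{(\tfrac12-\tfrac14)(\tfrac12-\tfrac34)} = -3, \qquad
\ell_3 = \frac{(0-\tfrac14)(0-\tfrac12)}{(\tfrac34-\tfrac14)(\tfrac34-\tfrac12)} = 1, \\
p(0) &= \ell_1\, p(\tfrac14) + \ell_2\, p(\tfrac12) + \ell_3\, p(\tfrac34)
      = 3\, p(\tfrac14) - 3\, p(\tfrac12) + p(\tfrac34).
\end{aligned}
```

The three weights are integers, so adding a multiple of $r$ to any single $p(t_j)$ changes the interpolated value only by a multiple of $r$.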

Theorem 5. Let $x = [x_1, \dots, x_n]$ and $y = [y_1, \dots, y_n]$ be two privately owned real vectors on
$[0, 1]^n$, where $n$ is fixed. Let $\mathrm{View}_1$ denote the view of party 1 obtained from the Secure-Inner-
Product protocol 2 (over the reals) with inputs $x, y$. The protocol outputs the inner product
$p = \sum_{i=1}^n x_i y_i$ and the distribution of $\mathrm{View}_1$ can be approximated arbitrarily closely (in total
variation distance, as $r$ increases) by a distribution depending on $x, y$ only through $p$
and $x$. The reciprocal result holds for party 2.

We omit the proof of this theorem to conserve space, since it falls outside the main scope
of the paper. We refer to Theorem 4 for a proof for a Secure-Inner-Product protocol that can
be used on real data via quantization.

We provide a third protocol to compute securely the inner-product function without using a 
third dummy party but ensuring only cryptographic security. This protocol uses the Oblivious 
Transfer (OT) protocol, developed by (UdOl, which is an important protocol for multi-party 
computations as it allows one to compute, in particular, secret shares of the product $x \cdot y$ of two bits $x$
and $y$, which can then be used to evaluate more general circuits. The basic
OT protocol allows a sender to transfer one of potentially many bits to a receiver; however, the 
sender remains oblivious as to what bit the receiver wants and the receiver remains oblivious 
about any other bits than the one he has requested. In other words, the functionality in the OT 
protocol takes the bits $(b_1, \dots, b_k)$ as inputs for the first party and the index $i$ for the second
party, and produces as output nothing for the first party and the bit $b_i$ requested by the second
party. Formally,

$$\mathrm{OT}_1^k((b_1, \dots, b_k), i) = (\lambda, b_i),$$

where $\lambda$ denotes the no-information symbol. We now describe $\mathrm{OT}_1^2$.

$\mathrm{OT}_1^2$ protocol

Sender inputs: $(b_0, b_1) \in \{0, 1\}^2$ and a private key $(n, d)$.
Receiver inputs: $i \in \{0, 1\}$ and a public key $(n, e)$.

Algorithm: 

1. The sender generates two random numbers $x_0, x_1$ and transmits them to the receiver.

2. The receiver generates a random number $k$, encrypts it with the public key and scrambles
the outcome with $x_i$ to produce $c = (x_i + k^e) \bmod n$, which it sends to the sender.

3. The sender decrypts the two numbers $(c - x_0)$ and $(c - x_1)$ to get $k_0$ and $k_1$ respectively
(i.e., it computes $k_j = (c - x_j)^d \bmod n$ for $j = 0, 1$). Note that either $k_0$ or $k_1$ is equal
to $k$, but these are equally likely for the sender, and reciprocally, $k_{i \oplus 1}$ is not accessible to
the receiver. The sender then transmits $a_0 = b_0 + k_0$ and $a_1 = b_1 + k_1$.

4. The receiver finds $b_i = a_i - k$.

The $\mathrm{OT}_1^k$ protocol is easily obtained by extending the previous protocol to multiple sender bits, and
similarly, one can extend the protocol to non-binary fields.
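The four steps of the two-message OT protocol above can be traced in a few lines of code. The following is a minimal sketch under stated assumptions: the RSA key is built from two tiny primes chosen here purely for illustration (it offers no security), sender and receiver run in the same process, and the name `oblivious_transfer` is hypothetical.

```python
import secrets

# Toy RSA key. Assumption: tiny primes for illustration only, NOT secure.
P, Q = 1009, 1013
N = P * Q                              # public modulus n
E = 65537                              # public exponent e
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent d (needs Python 3.8+)

def oblivious_transfer(b0, b1, i):
    """Simulate the protocol and return the message learned by the receiver."""
    # Step 1: the sender draws x_0, x_1 and sends them to the receiver.
    x = (secrets.randbelow(N), secrets.randbelow(N))
    # Step 2: the receiver encrypts a random k and blinds it with x_i.
    k = secrets.randbelow(N)
    c = (x[i] + pow(k, E, N)) % N
    # Step 3: the sender decrypts both candidates; only k_i equals k,
    # but the sender cannot tell which one that is.
    ks = [pow((c - xj) % N, D, N) for xj in x]
    a = [(b + kj) % N for b, kj in zip((b0, b1), ks)]
    # Step 4: the receiver unmasks the message it asked for.
    return (a[i] - k) % N

# The receiver asks for the second bit of (0, 1).
received = oblivious_transfer(0, 1, 1)
```

In this sketch the receiver never uses `a[1 - i]`; in the protocol that value is masked by a key the receiver cannot compute without the private exponent.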

We now present a cryptographic protocol for the inner product. 

Secure-Inner-Product Protocol 3. 

Common inputs: $q$ (the quantization level), $n$ (the vector dimension).
Party 1 inputs: $x_1, \dots, x_n \in \mathbb{Z}_q$.
Party 2 inputs: $y_1, \dots, y_n \in \mathbb{Z}_q$.

1. For $i = 1, \dots, n$,

(a) party 1 picks $x_i(2)$ uniformly at random in $\mathbb{Z}_{nq^2}$ and reveals it to party 2, who picks
$y_i(1)$ uniformly at random in $\mathbb{Z}_{nq^2}$ and reveals it to party 1.

(b) party 1 picks $a_i(1)$ uniformly at random in $\mathbb{Z}_{nq^2}$ and sends

$$\{-a_i(1),\; -a_i(1) + x_i(1),\; -a_i(1) + 2x_i(1),\; -a_i(1) + 3x_i(1),\; \dots,\; -a_i(1) + (nq^2 - 1)\,x_i(1)\}$$

(all operations mod $nq^2$) with $\mathrm{OT}_1^{nq^2}$ to party 2, who picks the $y_i(2)$-th element, denoted $a_i(2)$.

(c) party 2 picks $b_i(2)$ uniformly at random in $\mathbb{Z}_{nq^2}$ and sends

$$\{-b_i(2),\; -b_i(2) + x_i(2),\; -b_i(2) + 2x_i(2),\; -b_i(2) + 3x_i(2),\; \dots,\; -b_i(2) + (nq^2 - 1)\,x_i(2)\}$$

(all operations mod $nq^2$) with $\mathrm{OT}_1^{nq^2}$ to party 1, who picks the $y_i(1)$-th element, denoted $b_i(1)$.

(d) party 1 computes $p_i(1) = x_i(1) y_i(1) + a_i(1) + b_i(1) \bmod nq^2$ and party 2 computes
$p_i(2) = x_i(2) y_i(2) + a_i(2) + b_i(2) \bmod nq^2$. Note that these are shares of the product $x_i y_i$.

2. Party 1 computes $p(1) = \sum_{i=1}^n p_i(1) \bmod nq^2$ and reveals it to party 2, who computes
$p(2) = \sum_{i=1}^n p_i(2) \bmod nq^2$ and reveals it to party 1.

3. Each party computes $p(1) + p(2) \bmod nq^2 = \sum_{i=1}^n x_i y_i$.
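The share arithmetic of this two-party protocol can be checked with a short simulation. The following is a sketch under stated assumptions rather than the authors' implementation: the OT primitive is replaced by an ideal table lookup (`ideal_ot`, a hypothetical stand-in), the complementary shares are taken to be $x_i(1) = x_i - x_i(2)$ and $y_i(2) = y_i - y_i(1)$ modulo $nq^2$ (the protocol text does not spell this out), and both parties run in one process.

```python
import secrets

def ideal_ot(table, index):
    """Stand-in for 1-out-of-N oblivious transfer: returns only table[index]."""
    return table[index]

def secure_inner_product_3(x, y, q):
    """Single-process simulation of the OT-based protocol (illustrative only)."""
    n = len(x)
    m = n * q * q                       # all arithmetic is modulo n q^2
    p1 = p2 = 0
    for xi, yi in zip(x, y):
        # (a) random shares; the complementary shares are an assumption (see lead-in).
        x2 = secrets.randbelow(m)
        x1 = (xi - x2) % m
        y1 = secrets.randbelow(m)
        y2 = (yi - y1) % m
        # (b) party 1 masks multiples of x_i(1); party 2 selects entry y_i(2).
        a1 = secrets.randbelow(m)
        a2 = ideal_ot([(-a1 + k * x1) % m for k in range(m)], y2)
        # (c) party 2 masks multiples of x_i(2); party 1 selects entry y_i(1).
        b2 = secrets.randbelow(m)
        b1 = ideal_ot([(-b2 + k * x2) % m for k in range(m)], y1)
        # (d) local shares of the product x_i * y_i.
        p1 = (p1 + x1 * y1 + a1 + b1) % m
        p2 = (p2 + x2 * y2 + a2 + b2) % m
    # Steps 2 and 3: the parties exchange p(1), p(2) and add them.
    return (p1 + p2) % m

# Example with q = 4 and n = 3, so the modulus is n q^2 = 48.
result = secure_inner_product_3([1, 2, 3], [2, 0, 1], 4)
```

Each lookup table has $nq^2$ entries, which makes the cost of the OT calls visible even in this toy setting.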
From the protocol construction, we have the following result. 

Lemma 1. Secure-Inner-Product protocol 3 privately reduces the correlation computation to 
the OT protocol. 

The notion of being "privately reducible" is formally defined in Section 2.2 of [15]. From
the composition theorem for the semi-honest setting in Section 2.2 of [15], one obtains as a
consequence of the previous lemma that Secure-Inner-Product protocol 3 privately computes
the inner product provided that trapdoor one-way permutations exist. In particular, using
RSA for the encryptions in OT, the protocol is secure provided that RSA cannot be broken.

Figure 3: Computational circuit for the inner product $\sum_{i=1}^n x_i y_i$ when the inputs are $k$-bit
numbers.

This protocol requires $O(nq^2)$ OT protocols but only three communication rounds. This
still means a possibly high number of public and private encryptions/decryptions (e.g., with
RSA). One may use [16] to improve the running time of the OT protocols. Another approach consists
in using a Boolean circuit for correlations as in Figure 3, using OT protocols to compute shares
of the multiplication gates (and simply adding shares for the XOR gates). Such an approach,
as developed in 0, or related approaches as in [6], [5], may be particularly useful for other
functions such as the quantile function, which does not have the arithmetic structure of
the summation or inner-product functions. In particular, flUEl provide protocols with constant 
communication rounds, which may matter for practical considerations, although for real data
problems, the practicality of such algorithms needs to be further investigated.
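As an illustration of how OT yields shares of a multiplication gate, the sketch below covers only the simplest case, in which each party holds one input bit in the clear. It is an assumption-laden toy, not the construction of the cited works: `ideal_ot` is a hypothetical stand-in for a 1-out-of-2 OT call, and both parties run in one process.

```python
import secrets

def ideal_ot(messages, index):
    """Stand-in for 1-out-of-2 oblivious transfer: returns only messages[index]."""
    return messages[index]

def and_gate_shares(x, y):
    """Return XOR shares (s1, s2) of x AND y, for a sender bit x and receiver bit y."""
    r = secrets.randbelow(2)           # the sender's share is a fresh random bit
    s2 = ideal_ot((r, r ^ x), y)       # the receiver obtains r XOR (x AND y)
    return r, s2

def xor_gate_shares(a_shares, b_shares):
    """XOR gates need no interaction: each party XORs its own two shares."""
    return a_shares[0] ^ b_shares[0], a_shares[1] ^ b_shares[1]

s1, s2 = and_gate_shares(1, 1)
```

Neither share alone reveals the product, since each is masked by the random bit `r`; XOR-ing the two shares recovers it.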

Related literature on MPCs 
Theory 

The problem of secure multi-party computation emerged with the work of Yao [6] in 1982,
and with the work of Goldreich, Micali and Wigderson [0 in 1987. It is shown in H that 
any Boolean functionality can be computed without requiring an external trusted party for two
parties, and [2] provides protocols for arbitrarily many parties. Since these papers, many have
proposed variations of MPC settings, allowing different kinds of adversarial parties, security, 
and efficiency attributes. In particular, [5] introduces cryptographic protocols with bounded cir- 
cuit depths (requiring finitely many communication rounds) and [0 |3l 01 develop information- 
theoretic protocols. Homomorphic encryption has also been shown to provide another approach 
to secure multi-party computations [17, 18], and more recently, Gentry [19] showed that fully
homomorphic encryption schemes can be constructed, allowing addition and multiplication to 
be performed on encrypted data without having to decrypt it. This approach leads to MPC pro- 
tocols that do not have communication rounds increasing with the circuit complexity, although 
fully homomorphic encryption is still considered impractical. For certain functionalities, progress
toward practical fully homomorphic encryption has been achieved in [20] with somewhat
homomorphic encryption schemes based on the learning-with-errors assumption.

Applications 

The main applications associated with MPCs in the literature include distributed voting [21],
private bidding and auctions [22], data mining [23], and sharing of signatures [24]. MPCs were
first used in a real-world application only in 2008, when 1,200 farmers in
Denmark employed an MPC protocol in a nation-wide auction to determine the market price of
sugar-beet contracts without revealing their selling and buying prices [25]. The whole computation
took about half an hour, a satisfactory time for this application. In a different context, [26]
introduces a "Patient Controlled Encryption" scheme, in which an electronic health record system
allowing searches on encrypted data is developed.